{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "← Go back to [Querying and downloading cartographic material from loc.gov](maps-downloading-querying.ipynb) \n", "\n", "# Analyzing and visualizing cartographic metadata from loc.gov\n", "\n", "
Table of Contents
\n", "\n", "- [This notebook](#This-notebook)\n", "- [1. Required Prep: Install and import all of the Python modules we'll need](#1.-Required-Prep:-Install-and-import-all-of-the-Python-modules-we'll-need)\n", "- [2. Query for list of records](#2.-Query-for-list-of-records)\n", "- [3. Harvest the metadata](#3.-Harvest-the-metadata)\n", "- [4. Basic metadata analysis](#4.-Basic-metadata-analysis)\n", "- [5. Metadata charts](#5.-Metadata-charts)\n", "- [6. Metadata maps and Wikidata](#6.-Metadata-maps-and-Wikidata)\n", "\n", "More Resources
\n", " \n", "Other Jupyter notebooks and examples from the Library of Congress can be found at LC for Robots. \n", "\n", "\n", "Run the next cell to:
\n", " \n", "import modules. \n", " \n", "↓ ↓ ↓ ↓ ↓ ↓\n", "\n", "Run the next cell to:
\n", " \n", "create the `get_item_ids` function.\n", " \n", "↓ ↓ ↓ ↓ ↓ ↓\n", "\n", "Run the next cell to:
\n", "\n", "generate a list of Springfield atlas records, and save their IDs to the variable `ids`. The cell will print out the number of results. This is the number of atlases downloaded in the last Jupyter notebook, and it is the number of rows our metadata CSV spreadsheet will have. We can use the `len` function to tell us the length of `ids`.\n", "\n", "If you'd like to try a different query, update the URL in the `searchURL` variable.\n", "\n", "↓ ↓ ↓ ↓ ↓ ↓\n", "Run the next cell to:
\n", "\n", "look at how the metadata in an item record is structured, by looking at the first item in `ids`.\n", "\n", "↓ ↓ ↓ ↓ ↓ ↓\n", "Tip
\n", " \n", "Most web browsers have built-in JSON viewers. Try copying the link above labelled \"View item record json in your web browser\" into various browsers, to see how they each visualize JSON. \n", "\n", "You can also view the full JSON item record in this Jupyter notebook by running `item_metadata` or `print(item_metadata)`.\n", "\n", "How is the metadata structured?
\n", "\n", "Notice in the output above that the `Notes` and `Location` values are in square brackets: \n", "\n", "```\n", "Notes: \n", "[' Feb 1884. ', ' 15. ']\n", "\n", "Location: \n", "[{'illinois': 'https://www.loc.gov/search/?fa=location:illinois&fo=json'}, {'sangamon county': 'https://www.loc.gov/search/?fa=location:sangamon+county&fo=json'}, {'springfield': 'https://www.loc.gov/search/?fa=location:springfield&fo=json'}]\n", "```\n", "\n", "This indicates that these values are Python lists, containing multiple values. Each value in the list is separated by a comma. In `Notes`, there are two values in the list, and each is surrounded by single quotes. \n", "\n", "In loc.gov, `Notes` is a general field that can contain a range of information. For Sanborns, it is used to hold publication date and number of sheets. In this example, the `15` indicates 15 sheets in this particular Sanborn atlas.\n", "\n", "The structure of the `Location` field is a little more complicated. It's a list, but each item in the list is a Python dictionary. The example `Location` above is a list of three dictionaries. The first dictionary in the list looks like this:\n", "\n", "```\n", "{'illinois': 'https://www.loc.gov/search/?fa=location:illinois&fo=json'}\n", "```\n", "\n", "Curly brackets are used around dictionaries; square brackets are around lists. This dictionary has a single \"key:value\" pair. The key is \"illinois\". The value is a URL. \n", "\n", "The URL isn't useful for our purposes. For this example item, `Location` has three dictionaries, and their keys are: \"illinois\", \"sangamon county\", and \"springfield\". This tells us that the atlas is in Springfield, Sangamon County, Illinois. Notice that the metadata itself doesn't indicate any hierarchy between the locations.\n", "\n", "When we harvest our metadata, we're going to save it into a CSV file. For this CSV, any fields that contain lists will be split into multiple columns. 
For location, we'll get the keys (like \"illinois\", \"sangamon county\", \"springfield\") and we'll ignore the URLs. \n", "\n", "Let's create a function that will harvest item record metadata into a CSV.
\n", "\n", "\n", "The function will: \n", "- take our list of items (`ids`)\n", "- use the API to request each item record as JSON\n", "- extract the metadata fields we want from each record\n", "- add those fields to something Pandas calls a \"dataframe\"\n", "- save the dataframe as a CSV file\n", "\n", "The functions in this notebook are large and do many things. In the real world, best practice for Python functions is usually to break them up into multiple small functions, where each function does one thing.\n", "\n", "Run the next cell to:
\n", "\n", "create function `get_metadata_from_ids` for harvesting and saving our metadata to a CSV file. \n", "\n", "↓ ↓ ↓ ↓ ↓ ↓\n", "Let's run the function above.
\n", "\n", "The function will create a Pandas dataframe of our metadata. A dataframe is like a super-powered table for data calculation and analysis. We'll save that dataframe to a CSV file. The CSV file will be saved in the `saveTo` folder that you specify below. The filename will be `metadata.csv`.\n", "\n", "Run the next cell to:
\n", "\n", "run the function `get_metadata_from_ids` and save the results to a CSV named `metadata.csv` into the folder you specify in the `saveTo` variable.\n", "\n", "Be sure to edit `saveTo` to point to a folder on your computer.\n", "\n", " \n", "↓ ↓ ↓ ↓ ↓ ↓\n", "\n", "\n", " | date | \n", "id | \n", "id_simplified | \n", "location0 | \n", "location1 | \n", "location2 | \n", "notes0 | \n", "notes1 | \n", "page_count | \n", "title | \n", "notes2 | \n", "location3 | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "1884-02 | \n", "http://www.loc.gov/item/sanborn02163_001/ | \n", "sanborn02163_001 | \n", "illinois | \n", "sangamon county | \n", "springfield | \n", "Feb 1884. | \n", "15. | \n", "15.0 | \n", "Sanborn Fire Insurance Map from Springfield, S... | \n", "NaN | \n", "NaN | \n", "
1 | \n", "1890-07 | \n", "http://www.loc.gov/item/sanborn02163_002/ | \n", "sanborn02163_002 | \n", "illinois | \n", "sangamon county | \n", "springfield | \n", "Jul 1890. | \n", "26. | \n", "26.0 | \n", "Sanborn Fire Insurance Map from Springfield, S... | \n", "NaN | \n", "NaN | \n", "
2 | \n", "1896 | \n", "http://www.loc.gov/item/sanborn02163_003/ | \n", "sanborn02163_003 | \n", "illinois | \n", "sangamon county | \n", "springfield | \n", "1896. | \n", "81. | \n", "81.0 | \n", "Sanborn Fire Insurance Map from Springfield, S... | \n", "Map of congested district. Bound. | \n", "NaN | \n", "
3 | \n", "1886-08 | \n", "http://www.loc.gov/item/sanborn03245_001/ | \n", "sanborn03245_001 | \n", "kentucky | \n", "springfield | \n", "washington county | \n", "Aug 1886. | \n", "2. | \n", "2.0 | \n", "Sanborn Fire Insurance Map from Springfield, W... | \n", "NaN | \n", "NaN | \n", "
4 | \n", "1893-02 | \n", "http://www.loc.gov/item/sanborn03245_002/ | \n", "sanborn03245_002 | \n", "kentucky | \n", "springfield | \n", "washington county | \n", "Feb 1893. | \n", "2. | \n", "2.0 | \n", "Sanborn Fire Insurance Map from Springfield, W... | \n", "NaN | \n", "NaN | \n", "
5 | \n", "1898-10 | \n", "http://www.loc.gov/item/sanborn03245_003/ | \n", "sanborn03245_003 | \n", "kentucky | \n", "springfield | \n", "washington county | \n", "Oct 1898. | \n", "3. | \n", "3.0 | \n", "Sanborn Fire Insurance Map from Springfield, W... | \n", "NaN | \n", "NaN | \n", "
6 | \n", "1886 | \n", "http://www.loc.gov/item/sanborn03858_001/ | \n", "sanborn03858_001 | \n", "hampden county | \n", "massachusetts | \n", "springfield | \n", "1886. | \n", "35. | \n", "35.0 | \n", "Sanborn Fire Insurance Map from Springfield, H... | \n", "Bound. | \n", "NaN | \n", "
7 | \n", "1896 | \n", "http://www.loc.gov/item/sanborn03858_002/ | \n", "sanborn03858_002 | \n", "hampden county | \n", "massachusetts | \n", "springfield | \n", "1896. | \n", "85. | \n", "93.0 | \n", "Sanborn Fire Insurance Map from Springfield, H... | \n", "6 skeleton maps. | \n", "NaN | \n", "
8 | \n", "1894-06 | \n", "http://www.loc.gov/item/sanborn04392_001/ | \n", "sanborn04392_001 | \n", "brown county | \n", "minnesota | \n", "springfield | \n", "Jun 1894. | \n", "3. | \n", "3.0 | \n", "Sanborn Fire Insurance Map from Springfield, B... | \n", "NaN | \n", "NaN | \n", "
9 | \n", "1899-12 | \n", "http://www.loc.gov/item/sanborn04392_002/ | \n", "sanborn04392_002 | \n", "brown county | \n", "minnesota | \n", "springfield | \n", "Dec 1899. | \n", "3. | \n", "3.0 | \n", "Sanborn Fire Insurance Map from Springfield, B... | \n", "NaN | \n", "NaN | \n", "
10 | \n", "1884-04 | \n", "http://www.loc.gov/item/sanborn04881_001/ | \n", "sanborn04881_001 | \n", "greene county | \n", "missouri | \n", "springfield | \n", "Apr 1884. | \n", "8. | \n", "8.0 | \n", "Sanborn Fire Insurance Map from Springfield, G... | \n", "NaN | \n", "NaN | \n", "
11 | \n", "1886-11-06 | \n", "http://www.loc.gov/item/sanborn04881_002/ | \n", "sanborn04881_002 | \n", "greene county | \n", "missouri | \n", "north springfield | \n", "Nov 6 1886. | \n", "14. | \n", "14.0 | \n", "Sanborn Fire Insurance Map from Springfield, G... | \n", "North Springfield. | \n", "springfield | \n", "
12 | \n", "1891-06 | \n", "http://www.loc.gov/item/sanborn04881_003/ | \n", "sanborn04881_003 | \n", "greene county | \n", "missouri | \n", "springfield | \n", "Jun 1891. | \n", "23. | \n", "23.0 | \n", "Sanborn Fire Insurance Map from Springfield, G... | \n", "NaN | \n", "NaN | \n", "
13 | \n", "1896-07 | \n", "http://www.loc.gov/item/sanborn04881_004/ | \n", "sanborn04881_004 | \n", "greene county | \n", "missouri | \n", "springfield | \n", "Jul 1896. | \n", "26. | \n", "26.0 | \n", "Sanborn Fire Insurance Map from Springfield, G... | \n", "NaN | \n", "NaN | \n", "
14 | \n", "1886-09 | \n", "http://www.loc.gov/item/sanborn06900_001/ | \n", "sanborn06900_001 | \n", "clark county | \n", "ohio | \n", "springfield | \n", "Sep 1886. | \n", "26. | \n", "26.0 | \n", "Sanborn Fire Insurance Map from Springfield, C... | \n", "NaN | \n", "NaN | \n", "
15 | \n", "1891-11 | \n", "http://www.loc.gov/item/sanborn06900_002/ | \n", "sanborn06900_002 | \n", "clark county | \n", "ohio | \n", "springfield | \n", "Nov 1891. | \n", "39. | \n", "39.0 | \n", "Sanborn Fire Insurance Map from Springfield, C... | \n", "NaN | \n", "NaN | \n", "
16 | \n", "1894 | \n", "http://www.loc.gov/item/sanborn06900_003/ | \n", "sanborn06900_003 | \n", "clark county | \n", "ohio | \n", "springfield | \n", "1894. | \n", "61. | \n", "62.0 | \n", "Sanborn Fire Insurance Map from Springfield, C... | \n", "4 skeleton maps. Bound. | \n", "NaN | \n", "
17 | \n", "1888-01 | \n", "http://www.loc.gov/item/sanborn08380_001/ | \n", "sanborn08380_001 | \n", "robertson county | \n", "springfield | \n", "tennessee | \n", "Jan 1888. | \n", "3. | \n", "3.0 | \n", "Sanborn Fire Insurance Map from Springfield, R... | \n", "NaN | \n", "NaN | \n", "
18 | \n", "1893-04 | \n", "http://www.loc.gov/item/sanborn08380_002/ | \n", "sanborn08380_002 | \n", "robertson county | \n", "springfield | \n", "tennessee | \n", "Apr 1893. | \n", "4. | \n", "4.0 | \n", "Sanborn Fire Insurance Map from Springfield, R... | \n", "NaN | \n", "NaN | \n", "
19 | \n", "1898-01 | \n", "http://www.loc.gov/item/sanborn08380_003/ | \n", "sanborn08380_003 | \n", "robertson county | \n", "springfield | \n", "tennessee | \n", "Jan 1898. | \n", "5. | \n", "5.0 | \n", "Sanborn Fire Insurance Map from Springfield, R... | \n", "NaN | \n", "NaN | \n", "
20 | \n", "1885-06 | \n", "http://www.loc.gov/item/sanborn08950_001/ | \n", "sanborn08950_001 | \n", "springfield | \n", "vermont | \n", "windsor county | \n", "Jun 1885. | \n", "2. | \n", "2.0 | \n", "Sanborn Fire Insurance Map from Springfield, W... | \n", "NaN | \n", "NaN | \n", "
21 | \n", "1894-08 | \n", "http://www.loc.gov/item/sanborn08950_002/ | \n", "sanborn08950_002 | \n", "springfield | \n", "vermont | \n", "windsor county | \n", "Aug 1894. | \n", "2. | \n", "2.0 | \n", "Sanborn Fire Insurance Map from Springfield, W... | \n", "NaN | \n", "NaN | \n", "
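The numbered columns in the table above (`location0`, `notes0`, and so on) come from splitting list fields into one column per value. The following is a minimal sketch of that flattening step, not the notebook's actual `get_metadata_from_ids` code. The helper name `flatten_record` and the lowercase field names in the hand-built sample record are assumptions made for illustration.

```python
def flatten_record(record):
    """Flatten list-valued fields into numbered columns (notes0, notes1, ...)."""
    row = {}
    for field, value in record.items():
        if isinstance(value, list):
            for i, entry in enumerate(value):
                # Location entries are single-key dictionaries:
                # keep the key (the place name) and ignore the URL.
                if isinstance(entry, dict):
                    entry = next(iter(entry))
                row[f"{field}{i}"] = str(entry).strip()
        else:
            row[field] = value
    return row


# Hand-built sample record (an assumption), based on the example item shown earlier.
record = {
    "id": "http://www.loc.gov/item/sanborn02163_001/",
    "notes": [" Feb 1884. ", " 15. "],
    "location": [
        {"illinois": "https://www.loc.gov/search/?fa=location:illinois&fo=json"},
        {"sangamon county": "https://www.loc.gov/search/?fa=location:sangamon+county&fo=json"},
        {"springfield": "https://www.loc.gov/search/?fa=location:springfield&fo=json"},
    ],
}

row = flatten_record(record)
print(row)
```

Because records can hold different numbers of values, some rows end up with empty cells in the higher-numbered columns, which is why `notes2` and `location3` are mostly `NaN` in the table.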
Tip
\n", " \n", "When the cell above finishes running, scroll down to the bottom where you will see the `metadata_dataframe` displayed. Dataframes display like spreadsheets.\n", "\n", "Run the next cell to:
\n", "\n", "look for metadata about the downloaded atlas `sanborn02163_001`. You can replace \"sanborn02163_001\" below with any other atlas ID (that is, folder name) that you've just downloaded.\n", "\n", "↓ ↓ ↓ ↓ ↓ ↓\n", "\n", " | date | \n", "id | \n", "id_simplified | \n", "location0 | \n", "location1 | \n", "location2 | \n", "notes0 | \n", "notes1 | \n", "page_count | \n", "title | \n", "notes2 | \n", "location3 | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "1884-02 | \n", "http://www.loc.gov/item/sanborn02163_001/ | \n", "sanborn02163_001 | \n", "illinois | \n", "sangamon county | \n", "springfield | \n", "Feb 1884. | \n", "15. | \n", "15.0 | \n", "Sanborn Fire Insurance Map from Springfield, S... | \n", "NaN | \n", "NaN | \n", "
Found!
\n", "\n", "We can see that \"sanborn02163_001\" is a map of Springfield in Sangamon County, Illinois from February 1884. It has 15 sheets. (The number of sheets is in the `notes` field. We've also calculated it from the number of files in the JSON's `resource` field, and saved this to the `page_count` column.)\n", "\n", "Did we download any other atlases from the same Springfield in Sangamon County?
\n", "\n", "Let's see. We'll look in our dataframe for rows where the strings 'illinois', 'sangamon county', and 'springfield' appear in any of the location columns. (Alternatively, we could search using the `title` field, but we'll stick with the `location` columns for now.)\n", "\n", "Run the next cell to:
\n", "\n", "look for any other atlases we've downloaded for Springfield, Sangamon County, Illinois\n", "\n", "↓ ↓ ↓ ↓ ↓ ↓\n", "\n", " | date | \n", "id | \n", "id_simplified | \n", "location0 | \n", "location1 | \n", "location2 | \n", "notes0 | \n", "notes1 | \n", "page_count | \n", "title | \n", "notes2 | \n", "location3 | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "1884-02 | \n", "http://www.loc.gov/item/sanborn02163_001/ | \n", "sanborn02163_001 | \n", "illinois | \n", "sangamon county | \n", "springfield | \n", "Feb 1884. | \n", "15. | \n", "15.0 | \n", "Sanborn Fire Insurance Map from Springfield, S... | \n", "NaN | \n", "NaN | \n", "
1 | \n", "1890-07 | \n", "http://www.loc.gov/item/sanborn02163_002/ | \n", "sanborn02163_002 | \n", "illinois | \n", "sangamon county | \n", "springfield | \n", "Jul 1890. | \n", "26. | \n", "26.0 | \n", "Sanborn Fire Insurance Map from Springfield, S... | \n", "NaN | \n", "NaN | \n", "
2 | \n", "1896 | \n", "http://www.loc.gov/item/sanborn02163_003/ | \n", "sanborn02163_003 | \n", "illinois | \n", "sangamon county | \n", "springfield | \n", "1896. | \n", "81. | \n", "81.0 | \n", "Sanborn Fire Insurance Map from Springfield, S... | \n", "Map of congested district. Bound. | \n", "NaN | \n", "
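A search like the one above can be sketched as follows. This is an illustrative approach under stated assumptions, not necessarily the code in the notebook cell: the small dataframe is hand-built from three rows of the metadata table, and the variable names `wanted`, `location_columns`, and `matches` are hypothetical.

```python
import pandas as pd

# Hand-built subset of the metadata table (an assumption for illustration).
df = pd.DataFrame({
    "id_simplified": ["sanborn02163_001", "sanborn03245_001", "sanborn04881_002"],
    "location0": ["illinois", "kentucky", "greene county"],
    "location1": ["sangamon county", "springfield", "missouri"],
    "location2": ["springfield", "washington county", "north springfield"],
    "location3": [None, None, "springfield"],
})

wanted = ["illinois", "sangamon county", "springfield"]
location_columns = [c for c in df.columns if c.startswith("location")]

# A row matches only if every wanted string appears in at least one location column.
mask = pd.Series(True, index=df.index)
for place in wanted:
    mask &= df[location_columns].eq(place).any(axis=1)

matches = df[mask]
print(matches["id_simplified"].tolist())
```

Checking every location column matters because the order of values varies between records: "springfield" appears in `location2` for Illinois but in `location1` for Kentucky.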
What if we want to analyze our overall results?
\n", "\n", "For example, what is the average number of pages in the atlases we downloaded?\n", "\n", "Pandas and NumPy (another Python module) offer many functions for running calculations on data in dataframes. Here, we'll use a simple one built into Pandas: the `.describe()` function. Used on the `page_count` column, `.describe()` can give us information such as:\n", "\n", "- the average number of pages in the Sanborn map atlases we've downloaded\n", "- the largest number of pages in any atlas we've downloaded\n", "- the smallest number of pages in any atlas we've downloaded\n", "- the standard deviation of page counts\n", "\n", "Run the next cell to:
\n", "\n", "get a summary of the number of pages in the atlases we've downloaded (min, max, and mean).\n", "\n", "↓ ↓ ↓ ↓ ↓ ↓\n", "We can also visualize our results.
\n", "\n", "For example, we can analyze dates.\n", "\n", "Using a Python module called `matplotlib`, we can chart the frequency of atlases published, by year.\n", "\n", "Run the next cell to:
\n", "chart the downloaded atlases by publication year. \n", " \n", "↓ ↓ ↓ ↓ ↓ ↓\n", "Or we could chart how many pages are in each atlas
\n", "\n", "Run the next cell to:
\n", " \n", "chart the downloaded atlases by number of pages per atlas. \n", " \n", "↓ ↓ ↓ ↓ ↓ ↓\n", "Let's get a larger set of metadata, and visualize it with a choropleth map and timeline map.
\n", "\n", "Python has many visualization modules besides Matplotlib. Here, we'll use one called `plotly`. Let's use it to visualize information about Sanborn mapping of Illinois counties over time. \n", "\n", "To do this, we'll need to get some extra metadata from Wikidata, about the counties' locations. \n", "\n", "Specifically, we'll:\n", "\n", "1. get metadata from loc.gov for all of the Illinois Sanborn atlases online\n", "2. get location information from Wikidata for each Illinois county\n", "3. map the atlases, by county and publication year\n", "\n", "To get the Wikidata data that we'll combine with the loc.gov data, we'll use a SPARQL query. SPARQL is the query language used to retrieve data from Wikidata.\n", "\n", "First, let's write a new function that gets metadata about Illinois Sanborns from loc.gov.
\n", "\n", "For our location metadata, we need to get county names. In order to minimize the number of API requests we'll make, let's parse the county names from the `title` field rather than from the `location` metadata field. This will allow us to get all our information from our query results, without having to request each atlas's individual item record. The `location` field is in the item records, but isn't in the query result records. \n", "\n", "The county metadata is a little messy. We'll need to account for various types of \"counties\", as in these examples:\n", "\n", "* \"Belle Glade, Palm Beach County, Florida\" (standard pattern)\n", "* \"Cape Girardeau, Cape Girardeau, Missouri\" (counties missing the word \"County\")\n", "* \"Mankato, Blue Earth And Nicollet Counties, Minnesota\" (two counties)\n", "* \"Kansas City, Jackson, Clay, And Platte Counties, Missouri\" (three or more counties)\n", "* \"Saint Louis, Independent City, Missouri\" (cities without counties)\n", "* \"New Orleans, Orleans Parish, Louisiana\" (county-like entities)\n", "* \"Juneau, Juneau Census Division, Alaska\" (county-like entities whose status has changed over time)\n", "\n", "We'll tackle this inside our function, with a series of regular expressions. More details are in the comments within the function.\n", "\n", "Run the next cell to:
\n", " \n", "define the `get_Sanborn_counties` function, which creates a dataframe of atlases based on an API query. Any atlas that is labelled with more than one county (or county-like entity) will have a row for each county. \n", " \n", "↓ ↓ ↓ ↓ ↓ ↓\n", "\n", "Run the next cell to:
\n", " \n", "run the `get_Sanborn_counties` function to retrieve and clean metadata from loc.gov, for Illinois Sanborn atlases.\n", "\n", "The output will be the dataframe `atlases_df`, and the first 6 lines will display.\n", "\n", "This cell may take a long time to run!\n", " \n", "↓ ↓ ↓ ↓ ↓ ↓\n", "\n", " | city | \n", "count | \n", "county | \n", "date | \n", "state | \n", "year | \n", "
---|---|---|---|---|---|---|
0 | \n", "Abingdon | \n", "1 | \n", "Knox | \n", "1893-08 | \n", "Illinois | \n", "1893 | \n", "
1 | \n", "Abingdon | \n", "1 | \n", "Knox | \n", "1898-09 | \n", "Illinois | \n", "1898 | \n", "
2 | \n", "Abingdon | \n", "1 | \n", "Knox | \n", "1906-03 | \n", "Illinois | \n", "1906 | \n", "
3 | \n", "Abingdon | \n", "1 | \n", "Knox | \n", "1912-01 | \n", "Illinois | \n", "1912 | \n", "
4 | \n", "Albion | \n", "1 | \n", "Edwards | \n", "1894-06 | \n", "Illinois | \n", "1894 | \n", "
5 | \n", "Albion | \n", "1 | \n", "Edwards | \n", "1900-12 | \n", "Illinois | \n", "1900 | \n", "
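The `city`, `county`, and `state` columns above are parsed from place strings like the examples listed earlier. Here is a rough sketch of how such parsing could work with regular expressions. It is not the `get_Sanborn_counties` implementation: the function name `parse_place` and both patterns are assumptions, and the sketch leaves "Independent City" untouched rather than treating it specially.

```python
import re

# Suffixes that mark a county or county-like entity (assumed list, from the examples).
SUFFIX = re.compile(r"\s+(County|Counties|Parish|Census Division)$", re.IGNORECASE)
# Separators between multiple county names: ", ", ", And ", or " And ".
SEPARATOR = r"\s*,\s*(?:and\s+)?|\s+and\s+"


def parse_place(place):
    """Split 'City, County part, State' into (city, [counties], state)."""
    parts = [p.strip() for p in place.split(",")]
    city, state = parts[0], parts[-1]
    county_part = ", ".join(parts[1:-1])
    county_part = SUFFIX.sub("", county_part)
    counties = re.split(SEPARATOR, county_part, flags=re.IGNORECASE)
    return city, counties, state


examples = [
    "Belle Glade, Palm Beach County, Florida",
    "Mankato, Blue Earth And Nicollet Counties, Minnesota",
    "Kansas City, Jackson, Clay, And Platte Counties, Missouri",
    "New Orleans, Orleans Parish, Louisiana",
]
for example in examples:
    print(parse_place(example))
```

Returning a list of counties makes it straightforward to emit one row per county for atlases that span several.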
Run the next cell to:
\n", " \n", "view all of the unique county names in the `atlases_df` dataframe -- aka, Illinois counties with online Sanborn atlases.\n", " \n", "↓ ↓ ↓ ↓ ↓ ↓\n", "Great!
\n", "\n", "Now we have a dataframe of all the Illinois Sanborn atlases, split up so that there is one row per county per atlas. \n", "\n", "In order to map data, mapping tools need access to information about the counties' locations. There are lots of ways to get this information, and different types of location information we can get.\n", "\n", "For one of our maps, we'll use a geojson file from Plotly that matches counties' \"FIPS\" codes with polygons (county borders). FIPS codes are used by the Census Bureau and other U.S. federal agencies to uniquely identify counties and county-like entities. For another map, we'll use coordinates for counties' center points. \n", "\n", "We'll go to Wikidata to get the necessary data: \n", "- county FIPS code \n", "- county coordinates, which contain the center point latitude and longitude\n", "\n", "First, we need to reorganize our dataframe so that we have one row per county. To do this, we can use the Pandas function `.groupby()`.\n", "\n", "Run the next cell to:
\n", " \n", "reorganize the `atlases_df` dataframe into a dataframe of counties, with one county per row, and save the new dataframe to `counties_df`. The `count` column will be the total count of atlases for that county.\n", "\n", "The cell below will print the first 5 counties.\n", " \n", "↓ ↓ ↓ ↓ ↓ ↓\n", "\n", " | county | \n", "state | \n", "count | \n", "
---|---|---|---|
0 | \n", "Adams | \n", "Illinois | \n", "14 | \n", "
1 | \n", "Alexander | \n", "Illinois | \n", "8 | \n", "
2 | \n", "Bond | \n", "Illinois | \n", "9 | \n", "
3 | \n", "Boone | \n", "Illinois | \n", "6 | \n", "
4 | \n", "Brown | \n", "Illinois | \n", "7 | \n", "
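The reorganization into one row per county can be sketched with `.groupby()` as below. This is a minimal example on a hand-built three-row sample of `atlases_df` (an assumption for illustration), and the notebook's own cell may differ in its details.

```python
import pandas as pd

# Hand-built sample of atlases_df (an assumption), using rows from the earlier table.
atlases_df = pd.DataFrame({
    "city": ["Abingdon", "Abingdon", "Albion"],
    "count": [1, 1, 1],
    "county": ["Knox", "Knox", "Edwards"],
    "state": ["Illinois", "Illinois", "Illinois"],
    "year": [1893, 1898, 1894],
})

# Group rows by county and state, then add up the per-atlas counts.
counties_df = atlases_df.groupby(["county", "state"], as_index=False)["count"].sum()
print(counties_df)
```

Grouping on both `county` and `state` keeps same-named counties in different states separate, which matters if the query is widened beyond Illinois.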
SPARQL queries
\n", " \n", "A good starting point to learn about Wikidata SPARQL queries is the [Wikidata:SPARQL tutorial](https://www.wikidata.org/wiki/Wikidata:SPARQL_tutorial). SPARQL is an RDF query language broadly used for linked open data. \n", "\n", "Run the next cell to:
\n", " \n", "define the `get_fips_coords` function, which retrieves counties' FIPS codes and coordinates from Wikidata.\n", " \n", "↓ ↓ ↓ ↓ ↓ ↓\n", "Run the next cell to:
\n", " \n", "get county FIPS codes and coordinates from Wikidata, and merge this information back into the `counties_by_year` dataframe. Prints out a sample of 5 rows. \n", " \n", "↓ ↓ ↓ ↓ ↓ ↓\n", "\n", " | county | \n", "state | \n", "year | \n", "count | \n", "count_bycounty | \n", "fips | \n", "lat | \n", "long | \n", "
---|---|---|---|---|---|---|---|---|
0 | \n", "Adams | \n", "Illinois | \n", "1883 | \n", "1 | \n", "14 | \n", "17001 | \n", "39.990000 | \n", "-91.190000 | \n", "
144 | \n", "Cook | \n", "Illinois | \n", "1883 | \n", "1 | \n", "123 | \n", "17031 | \n", "41.800000 | \n", "-87.716667 | \n", "
1010 | \n", "Vermilion | \n", "Illinois | \n", "1884 | \n", "1 | \n", "30 | \n", "17183 | \n", "40.180000 | \n", "-87.740000 | \n", "
604 | \n", "Logan | \n", "Illinois | \n", "1884 | \n", "1 | \n", "16 | \n", "17107 | \n", "40.130000 | \n", "-89.360000 | \n", "
905 | \n", "Saint Clair | \n", "Illinois | \n", "1884 | \n", "2 | \n", "44 | \n", "17163 | \n", "38.470000 | \n", "-89.930000 | \n", "
935 | \n", "Sangamon | \n", "Illinois | \n", "1884 | \n", "1 | \n", "25 | \n", "17167 | \n", "39.760000 | \n", "-89.660000 | \n", "
1071 | \n", "Whiteside | \n", "Illinois | \n", "1884 | \n", "3 | \n", "25 | \n", "17195 | \n", "41.750000 | \n", "-89.910000 | \n", "
259 | \n", "Edgar | \n", "Illinois | \n", "1884 | \n", "1 | \n", "16 | \n", "17045 | \n", "39.680000 | \n", "-87.750000 | \n", "
36 | \n", "Bureau | \n", "Illinois | \n", "1885 | \n", "1 | \n", "35 | \n", "17011 | \n", "41.410000 | \n", "-89.530000 | \n", "
760 | \n", "Mercer | \n", "Illinois | \n", "1885 | \n", "1 | \n", "13 | \n", "17131 | \n", "41.200000 | \n", "-90.740000 | \n", "
373 | \n", "Henry | \n", "Illinois | \n", "1885 | \n", "2 | \n", "27 | \n", "17073 | \n", "41.350000 | \n", "-90.140000 | \n", "
519 | \n", "La Salle | \n", "Illinois | \n", "1885 | \n", "1 | \n", "50 | \n", "17099 | \n", "41.345556 | \n", "-88.842500 | \n", "
357 | \n", "Hancock | \n", "Illinois | \n", "1885 | \n", "2 | \n", "35 | \n", "17067 | \n", "40.400000 | \n", "-91.170000 | \n", "
542 | \n", "Lake | \n", "Illinois | \n", "1885 | \n", "1 | \n", "35 | \n", "17101 | \n", "38.720000 | \n", "-87.730000 | \n", "
58 | \n", "Carroll | \n", "Illinois | \n", "1885 | \n", "2 | \n", "12 | \n", "17015 | \n", "42.060000 | \n", "-89.920000 | \n", "
578 | \n", "Lee | \n", "Illinois | \n", "1885 | \n", "2 | \n", "20 | \n", "17103 | \n", "41.750000 | \n", "-89.300000 | \n", "
722 | \n", "McLean | \n", "Illinois | \n", "1885 | \n", "3 | \n", "39 | \n", "17113 | \n", "40.490000 | \n", "-88.850000 | \n", "
705 | \n", "McHenry | \n", "Illinois | \n", "1885 | \n", "3 | \n", "36 | \n", "17111 | \n", "42.320000 | \n", "-88.450000 | \n", "
338 | \n", "Grundy | \n", "Illinois | \n", "1885 | \n", "3 | \n", "26 | \n", "17063 | \n", "41.290000 | \n", "-88.430000 | \n", "
590 | \n", "Livingston | \n", "Illinois | \n", "1885 | \n", "3 | \n", "28 | \n", "17105 | \n", "40.890000 | \n", "-88.560000 | \n", "
330 | \n", "Greene | \n", "Illinois | \n", "1885 | \n", "4 | \n", "20 | \n", "17061 | \n", "39.350000 | \n", "-90.390000 | \n", "
983 | \n", "Tazewell | \n", "Illinois | \n", "1885 | \n", "1 | \n", "30 | \n", "17179 | \n", "40.510000 | \n", "-89.510000 | \n", "
323 | \n", "Gallatin | \n", "Illinois | \n", "1885 | \n", "1 | \n", "7 | \n", "17059 | \n", "37.760000 | \n", "-88.230000 | \n", "
683 | \n", "Marshall | \n", "Illinois | \n", "1885 | \n", "1 | \n", "16 | \n", "17123 | \n", "41.030000 | \n", "-89.340000 | \n", "
853 | \n", "Pike | \n", "Illinois | \n", "1885 | \n", "3 | \n", "15 | \n", "17149 | \n", "39.620000 | \n", "-90.890000 | \n", "
950 | \n", "Scott | \n", "Illinois | \n", "1885 | \n", "1 | \n", "5 | \n", "17171 | \n", "39.650000 | \n", "-90.480000 | \n", "
30 | \n", "Brown | \n", "Illinois | \n", "1885 | \n", "1 | \n", "7 | \n", "17009 | \n", "39.950000 | \n", "-90.750000 | \n", "
1086 | \n", "Will | \n", "Illinois | \n", "1885 | \n", "1 | \n", "26 | \n", "17199 | \n", "37.730000 | \n", "-88.930000 | \n", "
812 | \n", "Ogle | \n", "Illinois | \n", "1885 | \n", "1 | \n", "28 | \n", "17141 | \n", "42.040000 | \n", "-89.320000 | \n", "
1 | \n", "Adams | \n", "Illinois | \n", "1885 | \n", "1 | \n", "14 | \n", "17001 | \n", "39.990000 | \n", "-91.190000 | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
664 | \n", "Madison | \n", "Illinois | \n", "1947 | \n", "1 | \n", "32 | \n", "17119 | \n", "38.830000 | \n", "-89.910000 | \n", "
185 | \n", "Cook | \n", "Illinois | \n", "1948 | \n", "2 | \n", "123 | \n", "17031 | \n", "41.800000 | \n", "-87.716667 | \n", "
567 | \n", "Lake | \n", "Illinois | \n", "1948 | \n", "1 | \n", "35 | \n", "17101 | \n", "38.720000 | \n", "-87.730000 | \n", "
493 | \n", "Kankakee | \n", "Illinois | \n", "1948 | \n", "1 | \n", "24 | \n", "17091 | \n", "41.140000 | \n", "-87.860000 | \n", "
645 | \n", "Macoupin | \n", "Illinois | \n", "1948 | \n", "1 | \n", "37 | \n", "17117 | \n", "39.260000 | \n", "-89.920000 | \n", "
15 | \n", "Alexander | \n", "Illinois | \n", "1949 | \n", "1 | \n", "8 | \n", "17003 | \n", "37.190000 | \n", "-89.340000 | \n", "
923 | \n", "Saint Clair | \n", "Illinois | \n", "1949 | \n", "1 | \n", "44 | \n", "17163 | \n", "38.470000 | \n", "-89.930000 | \n", "
665 | \n", "Madison | \n", "Illinois | \n", "1950 | \n", "1 | \n", "32 | \n", "17119 | \n", "38.830000 | \n", "-89.910000 | \n", "
435 | \n", "Jefferson | \n", "Illinois | \n", "1950 | \n", "1 | \n", "15 | \n", "17081 | \n", "38.300000 | \n", "-88.920000 | \n", "
682 | \n", "Marion | \n", "Illinois | \n", "1950 | \n", "1 | \n", "22 | \n", "17121 | \n", "38.650000 | \n", "-88.920000 | \n", "
739 | \n", "McLean | \n", "Illinois | \n", "1950 | \n", "1 | \n", "39 | \n", "17113 | \n", "40.490000 | \n", "-88.850000 | \n", "
1126 | \n", "Winnebago | \n", "Illinois | \n", "1950 | \n", "2 | \n", "20 | \n", "17201 | \n", "42.330000 | \n", "-89.160000 | \n", "
631 | \n", "Macon | \n", "Illinois | \n", "1950 | \n", "1 | \n", "20 | \n", "17115 | \n", "39.860000 | \n", "-88.960000 | \n", "
129 | \n", "Clinton | \n", "Illinois | \n", "1950 | \n", "1 | \n", "21 | \n", "17027 | \n", "38.610000 | \n", "-89.420000 | \n", "
473 | \n", "Kane | \n", "Illinois | \n", "1950 | \n", "2 | \n", "39 | \n", "17089 | \n", "41.950000 | \n", "-88.433333 | \n", "
894 | \n", "Rock | \n", "Wisconsin | \n", "1950 | \n", "1 | \n", "3 | \n", "55105 | \n", "42.670000 | \n", "-89.070000 | \n", "
903 | \n", "Rock Island | \n", "Illinois | \n", "1950 | \n", "2 | \n", "21 | \n", "17161 | \n", "41.470000 | \n", "-90.570000 | \n", "
924 | \n", "Saint Clair | \n", "Illinois | \n", "1950 | \n", "2 | \n", "44 | \n", "17163 | \n", "38.470000 | \n", "-89.930000 | \n", "
944 | \n", "Sangamon | \n", "Illinois | \n", "1950 | \n", "2 | \n", "25 | \n", "17167 | \n", "39.760000 | \n", "-89.660000 | \n", "
1055 | \n", "Washington | \n", "Illinois | \n", "1950 | \n", "1 | \n", "19 | \n", "17189 | \n", "38.350000 | \n", "-89.420000 | \n", "
186 | \n", "Cook | \n", "Illinois | \n", "1950 | \n", "5 | \n", "123 | \n", "17031 | \n", "41.800000 | \n", "-87.716667 | \n", "
1025 | \n", "Vermilion | \n", "Illinois | \n", "1951 | \n", "1 | \n", "30 | \n", "17183 | \n", "40.180000 | \n", "-87.740000 | \n", "
720 | \n", "McHenry | \n", "Illinois | \n", "1953 | \n", "1 | \n", "36 | \n", "17111 | \n", "42.320000 | \n", "-88.450000 | \n", "
740 | \n", "McLean | \n", "Illinois | \n", "1953 | \n", "1 | \n", "39 | \n", "17113 | \n", "40.490000 | \n", "-88.850000 | \n", "
474 | \n", "Kane | \n", "Illinois | \n", "1953 | \n", "1 | \n", "39 | \n", "17089 | \n", "41.950000 | \n", "-88.433333 | \n", "
721 | \n", "McHenry | \n", "Illinois | \n", "1955 | \n", "1 | \n", "36 | \n", "17111 | \n", "42.320000 | \n", "-88.450000 | \n", "
187 | \n", "Cook | \n", "Illinois | \n", "1956 | \n", "1 | \n", "123 | \n", "17031 | \n", "41.800000 | \n", "-87.716667 | \n", "
1004 | \n", "Tazewell | \n", "Illinois | \n", "1956 | \n", "1 | \n", "30 | \n", "17179 | \n", "40.510000 | \n", "-89.510000 | \n", "
904 | \n", "Rock Island | \n", "Illinois | \n", "1957 | \n", "1 | \n", "21 | \n", "17161 | \n", "41.470000 | \n", "-90.570000 | \n", "
475 | \n", "Kane | \n", "Illinois | \n", "1958 | \n", "1 | \n", "39 | \n", "17089 | \n", "41.950000 | \n", "-88.433333 | \n", "
1142 rows × 8 columns
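Merging the Wikidata values into the per-year dataframe, as in the table above, can be sketched with a Pandas left merge. The two small dataframes below are hand-built from values in that table, and the name `wikidata_df` is hypothetical. This is an illustration of the merge step only, not the `get_fips_coords` function itself.

```python
import pandas as pd

# Hand-built sample of counties_by_year (an assumption for illustration).
counties_by_year = pd.DataFrame({
    "county": ["Adams", "Cook", "Adams"],
    "state": ["Illinois", "Illinois", "Illinois"],
    "year": [1883, 1883, 1885],
    "count": [1, 1, 1],
})

# Hypothetical lookup table of values retrieved from Wikidata.
# FIPS codes are kept as strings so that codes with leading zeros stay intact.
wikidata_df = pd.DataFrame({
    "county": ["Adams", "Cook"],
    "state": ["Illinois", "Illinois"],
    "fips": ["17001", "17031"],
    "lat": [39.99, 41.8],
    "long": [-91.19, -87.716667],
})

# A left merge keeps every county-year row and repeats the county's
# FIPS code and coordinates on each of its rows.
merged = counties_by_year.merge(wikidata_df, on=["county", "state"], how="left")
print(merged)
```

With `how="left"`, a county missing from the Wikidata results would keep its rows but show `NaN` for `fips`, `lat`, and `long`, which makes gaps easy to spot before mapping.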
\n", "Now we have the FIPS codes and coordinates from Wikidata.
\n", "\n", "Scroll to the bottom of the results above to look at the dataframe table. \n", "\n", "Notice Cook County high on the list? Chicago is in Cook County, which explains why the `count_bycounty` value (123) is so high. `count_bycounty` is the total number of online atlas volumes for Cook County across all years.\n", "\n", "Our data is ready for mapping.
\n", " \n", "First, let's make what's called a choropleth map, where Illinois's counties are colored according to how many atlas volumes are online for that county. This visualization relies on the FIPS codes we pulled from Wikidata. \n", "\n", "Run the next cell to:
\n", " \n", "generate a choropleth map of all Illinois Sanborn atlases online at loc.gov, by county.\n", "\n", "(Tip: zoom in to get a closer look at Illinois!)\n", " \n", "↓ ↓ ↓ ↓ ↓ ↓\n", "Next, let's make a proportional dot map, with a timeline at the bottom.
\n", "\n", "This visualization relies on the coordinates we pulled from Wikidata. \n", "\n", "Run the next cell to:
\n", " \n", "generate a timeline map showing the number of Illinois Sanborn atlases online at loc.gov, by county and year.\n", " \n", "↓ ↓ ↓ ↓ ↓ ↓\n", "Tip
\n", " \n", "Press play on the timeline, or pull the timeline to see atlases published by year.\n", "\n", "Atlases with publication dates in the mid-20th century are usually atlases originally published earlier, with updates later added. \n", "\n", "\n", "